Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

microcloud: Subnet sharing warning should check interfaces not IPs #522

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

gabrielmougard
Copy link
Contributor

closes #516

@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch 5 times, most recently from 8e01972 to 3c0887c Compare December 2, 2024 18:36
Copy link
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for moving the existing check from IPs to interfaces. Please have a look at my last comment, I think we should check the other two remaining ifaces as well.

service/lxd_config.go Outdated Show resolved Hide resolved
service/lxd_config.go Outdated Show resolved Hide resolved
service/lxd_config.go Outdated Show resolved Hide resolved
cmd/microcloud/ask.go Outdated Show resolved Hide resolved
cmd/microcloud/ask.go Outdated Show resolved Hide resolved
cmd/microcloud/ask.go Outdated Show resolved Hide resolved
cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch 2 times, most recently from 9cec54a to eb8c057 Compare December 3, 2024 11:49
@roosterfish
Copy link
Contributor

Not related to the PR but since we have switched the runners to the free ones this error seems to become peristent. It's probably caused by this so I am wondering if the test bed reset is just failing here.

Copy link
Contributor

@masnax masnax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one suggestion to reduce complexity & prevent having lots of extra fields & types.

Comment on lines 917 to 929
// ipWithIface is a helper struct to store an IP address and
// its corresponding network interface.
type ipWithIface struct {
ip net.IP
ifaceName string
}

Copy link
Contributor

@masnax masnax Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of maintaining both this and several fields for each interface on InitSystem, why not just introduce a new type Network with 3 fields: Interface, IP, Subnet which are populated by the corresponding values from net.Interface.

Then each of OVNGeneveAddr MicroCephInternalSubnet, and MicroCephPublicSubnet can be changed to OVNGeneveNetwork, CephInternalNetwork, CephPublicNetwork respectively.

And finally validateSystems you can check per-cluster member if any local interface names clash, and then across all cluster members if the subnets clash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from eb8c057 to a9f6ca7 Compare December 5, 2024 10:48
@gabrielmougard
Copy link
Contributor Author

@masnax @roosterfish I reworked the approach based on #522 (comment)

Copy link
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, just saw I never submitted the review. Please have a look when you find time.

cmd/microcloud/ask.go Outdated Show resolved Hide resolved
cmd/microcloud/ask.go Outdated Show resolved Hide resolved
c.systems[sh.Name] = bootstrapSystem

// This is to avoid the situation where the internal network for Ceph has been skipped, but the public network has been set.
// Ceph will automatically set the internal network to the public Ceph network if the internal network is not set, which is not what we want.
// Instead, we still want to keep the internal Ceph network to use the MicroCloud internal network as a default.
if internalCephSubnet == microCloudInternalNetworkAddrCIDR {
bootstrapSystem.MicroCephInternalNetworkSubnet = microCloudInternalNetworkAddrCIDR
microcloudNetworkInterface, err := lxd.FindInterfaceForSubnet(microCloudInternalNetworkAddrCIDR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if InitSystem should grow another MicroCloudInternalNetwork *Network where we can store this information after the user has selected the internal address for MicroCloud?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I had the same thinking I remember... For now, it seems that we don't need it for the current state of things but it might be needed in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was if we now check for collisions in between the OVN and Ceph networks, wouldn't it be worth also including the MicroCloud internal network into those checks?

cmd/microcloud/preseed.go Outdated Show resolved Hide resolved
cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
@roosterfish
Copy link
Contributor

Also requires a rebase now due to the recent CLI changes.

@gabrielmougard
Copy link
Contributor Author

Thanks for the feedback!

@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from a9f6ca7 to e93d076 Compare January 7, 2025 09:39
@gabrielmougard
Copy link
Contributor Author

@roosterfish updated

cmd/microcloud/main_init.go Show resolved Hide resolved
cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
c.systems[sh.Name] = bootstrapSystem

// This is to avoid the situation where the internal network for Ceph has been skipped, but the public network has been set.
// Ceph will automatically set the internal network to the public Ceph network if the internal network is not set, which is not what we want.
// Instead, we still want to keep the internal Ceph network to use the MicroCloud internal network as a default.
if internalCephSubnet == microCloudInternalNetworkAddrCIDR {
bootstrapSystem.MicroCephInternalNetworkSubnet = microCloudInternalNetworkAddrCIDR
microcloudNetworkInterface, err := lxd.FindInterfaceForSubnet(microCloudInternalNetworkAddrCIDR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was if we now check for collisions in between the OVN and Ceph networks, wouldn't it be worth also including the MicroCloud internal network into those checks?

@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch 3 times, most recently from 9f57467 to 1a8b231 Compare January 9, 2025 15:10
@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from 1a8b231 to 94488b6 Compare January 27, 2025 15:55
cmd/microcloud/ask.go Outdated Show resolved Hide resolved
cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from 94488b6 to a3b5fb2 Compare January 29, 2025 15:07
Copy link
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test cases!
The validation still seems to be outputting something else in case of single node MicroCloud. Please have a look at my comments.

cmd/microcloud/ask.go Outdated Show resolved Hide resolved
cmd/microcloud/ask.go Outdated Show resolved Hide resolved
cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
cmd/microcloud/main_init_test.go Show resolved Hide resolved
This type will be used to store a variety of subnet informations and which local network interface
they will use. This will be used both for configuring MicroCloud but for running validation as well (e.g, interface collisions within a member, subnet collisions between members of a cluster)

Signed-off-by: Gabriel Mougard <gabriel.mougard@canonical.com>
This will be needed to get a network interface info from a CIDR subnet notation (e.g, mostly in the case of configuring MicroCloud internal and public networks)

Signed-off-by: Gabriel Mougard <gabriel.mougard@canonical.com>
@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch 3 times, most recently from e9dbb77 to c704f7c Compare February 3, 2025 15:55
@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from c704f7c to f09fbbe Compare February 3, 2025 16:15
@gabrielmougard
Copy link
Contributor Author

hmm there seems to be a problem.. I'm investigating this

@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from f09fbbe to 5b8e468 Compare February 4, 2025 14:17
@roosterfish
Copy link
Contributor

hmm there seems to be a problem.. I'm investigating this

Please let me know when I should have another look.

@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from 5b8e468 to 673c61f Compare February 4, 2025 14:39
…MicroCephPublicNetwork` and `MicroCephInternalNetwork` instead of `OVNGeneveAddr`, `MicroCephPublicNetworkSubnet` and `MicroCephInternalNetworkSubnet`

Signed-off-by: Gabriel Mougard <gabriel.mougard@canonical.com>
This function detect the local network interface collisions and the global subet collisions

Signed-off-by: Gabriel Mougard <gabriel.mougard@canonical.com>
Signed-off-by: Gabriel Mougard <gabriel.mougard@canonical.com>
@gabrielmougard gabrielmougard force-pushed the fix/subnet-sharing-warning branch from 673c61f to 2a49acf Compare February 4, 2025 15:19
Copy link
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, looks good already, only a few more smaller comments. Let's have a chat tomorrow morning to go through the details and especially around the verbosity.

@@ -388,9 +407,9 @@ func (c *initConfig) addPeers(sh *service.Handler) (revert.Hook, error) {
CephConfig: info.MicroCephDisks,
}

if info.OVNGeneveAddr != "" {
if info.OVNGeneveNetwork != nil && info.OVNGeneveNetwork.IP != nil && len(info.OVNGeneveNetwork.IP) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a case in which info.OVNGeneveNetwork != nil is not nil but the other fields aren't set? I wonder if the list of checks can be reduced to just the first one.

@@ -694,12 +712,12 @@ func (c *initConfig) setupCluster(s *service.Handler) error {

if s.Type() == types.MicroCeph {
microCephBootstrapConf := make(map[string]string)
if bootstrapSystem.MicroCephInternalNetworkSubnet != "" {
microCephBootstrapConf["ClusterNet"] = bootstrapSystem.MicroCephInternalNetworkSubnet
if bootstrapSystem.MicroCephInternalNetwork != nil && bootstrapSystem.MicroCephInternalNetwork.Subnet != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, do we need to check for more than MicroCephInternalNetwork != nil?

@@ -709,8 +727,8 @@ func (c *initConfig) setupCluster(s *service.Handler) error {

if s.Type() == types.MicroOVN {
microOvnBootstrapConf := make(map[string]string)
if bootstrapSystem.OVNGeneveAddr != "" {
microOvnBootstrapConf["ovn-encap-ip"] = bootstrapSystem.OVNGeneveAddr
if bootstrapSystem.OVNGeneveNetwork != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above you are checking the field you are trying to access so to keep things consistent I would also check for OVNGeneveNetwork.IP before accessing it or just check for OVNGeneveNetwork.

If really only one of the fields is set this seems to be an error in my view and we should exit out or report this.

@@ -1095,14 +1104,34 @@ func (p *Preseed) Parse(s *service.Handler, c *initConfig, installedServices map
internalCephNetwork = customTargetCephInternalNetwork
}

// Also register the MicroCloud internal network in the bootstrap system information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw askAddress() func is only called in the interactive setup even though I made this request in one of the other comments.
So I would have envisioned this to be set there for both preseed an interactive mode at one place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we have just discussed we might be able to use just askAddress and don't run the collision checks in case of preseed.

cmd/microcloud/preseed.go Show resolved Hide resolved
cmd/microcloud/preseed.go Show resolved Hide resolved
cmd/microcloud/main_init.go Show resolved Hide resolved
cmd/microcloud/main_init.go Show resolved Hide resolved
func (c *initConfig) validateSystems(s *service.Handler) (err error) {
warnings := detectCollisions(c.systems)
if len(warnings) > 0 {
fmt.Println("! Network collision:")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point you mentioned yesterday regarding this to be too verbose when you just select the defaults.
I wonder if instead of naming it "collision" we should maybe just print this as general information?
In the docs we recommend a dual-port network card so it's indeed weird if you then get a warning which might suggest you are doing something wrong.

if len(warnings) > 0 {
fmt.Println("! Network collision:")
for _, warning := range warnings {
fmt.Println(warning)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please use the same indentation as for the error message you get when selecting Ceph disks and being not fault tolerant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed let's use this format using Checking network configuration ... which is more informative and doesn't suggest any misconfiguration:

Checking network configuration ...
 Ceph cluster network, Ceph public network, MicroCloud internal network, OVN underlay sharing interface "enp5s0" on "micro01"
 Ceph cluster network, Ceph public network sharing subnet "10.87.144.0/24"
 MicroCloud internal network, OVN underlay sharing subnet "fd42:b2d8:b833:2722:1ad4:f4c4:2472:7267/128"
Initializing new services ...
 Local MicroCloud is ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Subnet sharing warning should check interfaces not IPs
3 participants